Add Regex.to_embed/2 #14379
Thank you for the PR. Given [...] So my suggestion is to add a single function, called [...]
7ab69cd to 0bef2c1
Hi, I have updated the change as you requested. I made one tweak, adding a "strict" option for [...] FWIW, more modern versions of PCRE may one day support more options than just 'imsx'. Perl itself supports 'u' in embeddable form, although with a slightly different meaning: in Elixir/Erlang/PCRE the /u flag means "string encoded as unicode" and also "use unicode semantics", whereas in Perl /u means "use unicode semantics regardless of the encoding". This is why the exception mentions the current version of PCRE. Hope this is what you had in mind with your feedback!
0bef2c1 to 092407e
to_embed(regex, strict) returns an embeddable representation of regex. For instance, ~r/foo/i can be represented as ~r/(?i-msx:foo)/. If the option :strict is true (the default), it raises an ArgumentError if the regex was compiled with an option/modifier that cannot be represented in an embeddable pattern. If :strict is false, any unembeddable options are silently ignored. This may be perfectly reasonable: for instance, the wrapping pattern may be compiled with the same modifiers as the embedded pattern, or reusing the pattern without the unembeddable modifiers may not change its semantics.
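The behaviour described in this commit message can be approximated with plain string handling. What follows is a minimal sketch, not the code from this PR: the module `EmbedSketch` and its `embed/3` function are hypothetical names, and it takes the pattern source and sigil-style modifier string as separate arguments rather than a compiled `Regex`.

```elixir
defmodule EmbedSketch do
  # Modifiers that PCRE accepts in the inline "(?on-off:...)" form.
  @embeddable ~c"imsx"

  # Hypothetical helper (not part of the Regex module): `source` is the
  # pattern text and `modifiers` the sigil-style modifier string.
  def embed(source, modifiers, opts \\ []) do
    strict = Keyword.get(opts, :strict, true)
    given = String.to_charlist(modifiers)
    unsupported = Enum.reject(given, &(&1 in @embeddable))

    if strict and unsupported != [] do
      raise ArgumentError,
            "modifiers #{inspect(List.to_string(unsupported))} cannot be embedded"
    end

    # Filtering @embeddable (rather than `given`) keeps a stable modifier order.
    on = Enum.filter(@embeddable, &(&1 in given))
    off = @embeddable -- on

    # In extended mode a trailing "# comment" would otherwise swallow the ")".
    newline = if ?x in on, do: "\n", else: ""

    "(?#{on}" <> off_part(off) <> ":#{source}#{newline})"
  end

  # Omit the "-" entirely when no modifier is switched off.
  defp off_part([]), do: ""
  defp off_part(off), do: "-#{off}"
end

IO.puts(EmbedSketch.embed("foo", "i"))
IO.puts(EmbedSketch.embed("foo", "iu", strict: false))
```

With `strict: false` the sketch simply drops the `u` modifier, which mirrors the "silently ignored" wording above; with the default it raises instead.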
092407e to 9bce632
Looks great, I have dropped only some minor suggestions now and we can ship it!
Minor fixups and simplifications. Co-authored-by: José Valim <[email protected]>
* Sentences should not start with 'And'.
* Rework sentence about unlisted regex compile options.
* Consistent formatting for the 'strict' option.
also add comment about why we sort the modifiers
Suggestions turned to commits, with one caveat about a compromise wording as noted, and I followed up on your point about to_string(). I didn't squash so it's easier for you to review; you said previously you didn't mind doing that yourself.
| 💚 💙 💜 💛 ❤️ | 
This patch, which works on 1.18.x but does not work on 1.19, is an attempt to implement the String.Chars protocol, along with Regex.to_string(), Regex.modifiers(), Regex.to_string!() and Regex.modifiers!() functions.
The idea is to make it possible to safely embed precompiled regexes into other regexes, in a similar way to that supported by Perl. For example, ~r/foo/x turns into "(?x-ims:foo\n)", and so on. The embedded regex should therefore match the same as it would in its original form, even when the pattern it is embedded into has a different set of modifiers.
For review by Jose.
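
To illustrate the embedding idea described above, here is a small sketch that uses only `Regex.compile/1`, `Regex.compile!/2` and `Regex.match?/2`. The embeddable strings are written out by hand (they are assumptions about what the patch would produce, based on the examples in this description), and the variable names are purely illustrative.

```elixir
# Embeddable form of ~r/foo/i, written out by hand for illustration.
inner = "(?i-msx:foo)"

# The outer pattern stays case-sensitive; only the embedded group ignores case.
outer = Regex.compile!("^bar" <> inner <> "$")
inner_ignores_case = Regex.match?(outer, "barFOO")
outer_ignores_case = Regex.match?(outer, "BARfoo")

# The reverse: a case-insensitive outer pattern around a group that turns "i" off.
mixed = Regex.compile!("^bar(?-imsx:foo)$", "i")
outer_only = Regex.match?(mixed, "BARfoo")
inner_leaks = Regex.match?(mixed, "barFOO")

# In extended mode a "#" comment runs to the end of the line, so without the
# trailing newline the closing ")" would be part of the comment.
unterminated = Regex.compile("(?x-ims:foo # note)")
terminated = Regex.compile!("(?x-ims:foo # note\n)")
extended_ok = Regex.match?(terminated, "foo")

IO.inspect({inner_ignores_case, outer_ignores_case, outer_only, inner_leaks, extended_ok})
IO.inspect(unterminated)
```

The last pair is the reason the `x` example above carries a `\n` before the closing parenthesis: the group scopes its own modifiers, so it also has to protect its own terminator from a trailing comment.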